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Gravitational lensing of the cosmic microwave background (CMB) polarization field has been rec- 
ognized as a potentially valuable probe of the cosmological density field. We apply likelihood-based 
techniques to the problem of lensing of CMB polarization and show that if the B-mode polarization 
is mapped, then likelihood-based techniques allow significantly better lensing reconstruction than 
is possible using the previous quadratic estimator approach. With this method the ultimate limit 
to lensing reconstruction is not set by the lensed CMB power spectrum. Second-order corrections 
are known to produce a curl component of the lensing deflection field that cannot be described by 
a potential; we show that this does not significantly affect the reconstruction at noise levels greater 
ff) , than 0.25 /iK arcmin. The reduction of the mean squared error in the lensing reconstruction relative 

to the quadratic method can be as much as a factor of two at noise levels of 1.4 fiK arcmin to a 
' factor of ten at 0.25 )j,K arcmin, depending on the angular scale of interest. 
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PACS numbers: 95.75.Pq,98.65.Dx,98.80.Es 



I. INTRODUCTION 



Over the past decade the cosmic microwave background (CMB) anisotropy has been established as a robust and 
powerful cosmological probe. While much attention has focused on the primary anisotropy generated in the early 
universe, the CMB should also contain signatures of processes that occurred between the surface of last scatter and 
the present. One of these is weak gravitational lensing, which has been recognized as a probe of large scale structure 
l/S ■ (LSS) P, 0, IS 0, HJ. Aside from its use as a probe of the matter power spectrum at low redshift z <C 1100, weak 
lensing of the CMB could be cross correlated against other tracers of the density field such as galaxy surveys |(j or 
weak lensing of galaxies Through cross-correlation with the CMB temperature, an improved measurement of the 
integrated Sachs- Wolfe effect over that possible using the CMB power spectrum alone is possible, yielding constraints 
on the late-time growth function and hence on the dark energy 0, • Lensing has also attracted attention recently 
as a cosmological source of B-mode polarization [f| ; reconstruction and removal of lensing B-modes will thus be an 
'■^ ' important part of a future search for .B-mode polarization induced by primordial gravitational waves [Tol TTTL . 

The lensing signal in the CMB is small, so it is important to construct optimal methods for estimating the lensing 



field from CMB data. The early investigations of lensing of the CMB temperature showed that while there is an effect 



igations ot 

of lensing on the CMB power spectrum |l3lll^ |. it is much more promising to estimate the lensing field using quadratic 



combinations of the CMB temperature, and to estimate the lensing power spectrum using the four-point correlation 
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function (or its harmonic equivalent, the trispectrum) 0,0)^3- More recent work has identified the divergence of 
the temperature-weighted gradient as the optimal quadratic combination of the CMB temperature for use in lensing 
studies p], IHI . Analysis based on likelihood techniques |17| has since shown that the quadratic estimator technique 
is statistically optimal when the lensing effect on the CMB covariance matrix is small. This was shown to be a good 
approximation for lensing of temperature anisotropies in the range I < 3500. For the small scales I 3> 3500, the 
primary CMB power spectrum is much smaller than the lensed power spectrum, hence this argument breaks down. 
In this case for sufficiently small instrument noise the reconstruction of projected mass density can be nearly perfect 
[Tc|. We will not discuss the reconstruction on these very small scales in this paper. 

Our ability to reconstruct the lensing field using the CMB temperature is limited because the temperature fluc- 
tuations are stochastic and so we can only statistically determine the unlensed CMB temperature field. It is thus 
advantageous to consider lensing of the CMB polarization, since in the absence of primordial gravitational waves the 
unlensed CMB polarization is entirely in E rather than B modes. This implies that, in the terminology of galaxy 
lensing, there is no "shape noise" in the CMB polarization field. Several authors have developed algorithms that use 
the -B-modes induced by lensing to probe LSS |5l Il9|. The optimal quadratic estimator - the polarization analogue 
of the temperature-based quadratic estimator using the divergence of the temperature-weighted gradient - was con- 
structed by Ref . ^(| • There it was shown that for sufficiently small detector noise most of the lensing reconstruction 
information with this method is provided by the B mode. 
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TABLE I: Parameters for CMB experiments. 



Experiment 


Mp / /iK arcmin 


Ofwhm 1 arcmin 


lc 


WMAP, 4 yrs. (94 GHz) 


700 


13 


620 


Planck, 1 yr. (143 GHz) 


81 


8 


1010 


Ref. Expt. A 


3.0 


7 


1160 


Ref. Expt. B 


1.41 


7 


1160 


Ref. Expt. C 


1.41 


4 


2020 


Ref. Expt. D 


1.00 


4 


2020 


Ref. Expt. E 


0.50 


2 


4050 


Ref. Expt. F 


0.25 


2 


4050 



Even with polarization information these quadratic estimators cannot improve the reconstruction beyond a certain 
level, set by the coherence length of the polarization. It has been argued that this provides a fundamental limit to our 
ability to separate the lensing induced B modes from the B modes induced by gravity waves [Til fl^| ■ However, it has 
not been determined whether quadratic estimation is optimal for the polarization-based lensing reconstruction, and 
indeed Refs. [Ill [12J comment that it might be possible to extract additional information in higher-order statistics. 
The argument for optimality of the quadratic estimator presented by Ref. [l7| does not apply to polarization since 
the -B-mode power is dramatically increased by lensing. Here we construct likelihood-based estimators for lensing 
using the CMB polarization and show that the likelihood-based estimator improves significantly on the quadratic 
estimator (although we do not present these as series of higher-order statistics). Indeed, as noise is decreased the 
accuracy of CMB lensing reconstruction continues to improve without bound. Conceptually this is because if the 
lensed polarization is measured with zero noise, then the equation B U niensed = can be solved (except possibly for a 
small number of degenerate modes) for the projected matter density with zero noise. The equation -E> U nicscd = is 
ill-behaved in the presence of instrument noise; fortunately, the likelihood formalism easily incorporates noise and, as 
we show in this paper, regularizes the problem. 

In practice, a perfect reconstruction of the lensing potential is impossible because as instrument noise is reduced, 
some contaminant to the lensing signal will eventually become more important than the instrument noise. One 
candidate for this limiting factor is lensing field rotation caused by the fact that the density perturbations causing 
the lensing are spread out along the line of sight (i.e. there is more than one "lens plane") and that the lensing is 
not perfectly weak (i.e. the first-order Born approximation to the lensing field is inexact). We will show that even for 
an experiment with noise 0.25 /iK arcmin and 2 arcmin full-width half maximum beam, the field rotation does not 
substantially worsen the lensing reconstruction. It is however possible that foreground contamination will be a more 
serious problem. 

Studies of lensing of CMB polarization will require that the polarization field be mapped with noise levels 
of the order of ~ 1 /iK arcmin; this would be a substantial improvement in sensitivity beyond that of the 
current Wilkinson Microwave Anisotropy Probe (WMAP; http://map.gsfc.nasa.gov/) and the upcoming Planck 
(http://astro.estec.esa.nl/Planck/) experiments (see Table QJ. Nevertheless, ~ 1 uK arcmin may be achieved with 
a future polarization satellite. The noise levels of <0.25 /xK arcmin at which field rotation becomes important will 
probably remain unachievable for the foreseeable future. 

This paper is organized as follows: in flTTl we define our notations and conventions. In flllll we consider the properties 
of the likelihood function and its implications for likelihood and Bayesian analyses of CMB lensing reconstruction. 
In ijlVI we investigate the breakdown of the Born approximation for CMB lensing and its implications for lensing 
reconstruction. Determination of the lensing power spectrum from CMB maps is discussed in ^3 We show numerical 
simulations of CMB polarization lensing and reconstruction using our estimators in WII and conclude in Will 

The fiducial cosmology used in these simulations is a spatially flat cosmological constant-dominated universe with 
baryon fraction tl^o = 0.046; cold dark matter fraction Q c q = 0.224; cosmological constant fraction fl\o = 0.73; 
Hubble parameter Hq = 72 km/s/Mpc; primordial helium abundance Yp = 0.24; reionization optical depth r r = 0.17; 
primordial scalar spectral index n s = 1; and no primordial vector or tensor perturbations. We have used the CMBFAST 
numerical package 20] to compute all power spectra except in illVBI The experiments considered are as shown in 
Table [Q The WMAP and the Planck will not be able to map the lensing field using polarization and are included 
in the table for comparison. The Reference Experiments A through F are successively lower-noise (or finer-beam) 
experiments that were analyzed to determine how the signal-to-noise ratio in the lensing reconstruction depends on 
experimental parameters. Note that Experiment C is the Reference Experiment of Refs. [TollTH. 



3 



II. FORMALISM 

Here we describe our normalization conventions; note that for some quantities, there are many conventions in use 
in the literature, and appropriate conversion factors must be applied if one wishes to compare results. 

A. CMB 

We work in the normalized flat-sky approximation, i.e. the sky is taken to be a flat square of side length \/47r 
(i.e. total area 4-7r) with periodic boundary conditions. The CMB temperature and polarization fields can then be 
expressed as a sum over Fourier modes: 

( T(n) \ , ( T\\ 

(1) 

where the 1-modes are distributed in the two-dimensional 1-space with number density 1/tt. Defining the angle of a 
mode by tan</>i = l y /l x , we have E and B polarization modes given by: 




E x \ _ I cos(20i) sin(20i) 
Bi ) ~ I - sin(2(/>i) cos(2(^i) 




(2) 



(Technically the angle 4>o of the 1 = mode is undefined, however this will not concern us since within the flat-sky 
approximation we will convert sums over 1 into integrals: — > J d 2 l/ir. If an integral is divergent at 1 = 0, then it 
cannot be computed accurately within the flat-sky approximation.) 

We will use the following notations for CMB fields: {T,Q,U} for the unlensed (primary) CMB anisotropies; 
{T, Q,U} for the lensed CMB anisotropies; and {T, Q, U} for the measured anisotropies (including noise). These are 
measured in /iK (blackbody temperature), and we will assume that the monopole (mean temperature) and special- 
relativistic effects (kinematic dipole/quadrupole and stellar aberration) have been removed. The instrument noise 
will be assumed to be statistically uncorrelated with any cosmological signal and will be denoted by rjx, where X is 
one of T, Q, or U (or T, E, and B depending on which basis is more convenient). The unlensed CMB will have a 
power spectrum given by: 

{X?X{,)=C{ {X '5 1 , l ,, (3) 

where here X and X' are T, E, or B (here we desire rotational symmetry so we cannot use Q or U). We assume the 
universe is statistically parity-invariant so that Cj B = Cf B = 0; in some parts of this paper we will discuss universes 
with no tensor perturbations, in which case we also have C BB = 0. 

Throughout most of this paper we will take general noise covariance N; when we wish to show expected performance 
for particular experiments, we will use the following noise power spectrum appropriate for a Gaussian beam profile: 

N TT _ j^2 e l(l+l)e 2 FWHM /8ln2 _ j^ e l(l+l)/l c (l c +l) ^ ^ 

where Qfwhm is the full- width at half maximum (FWHM) of the beam. We take a similar form for N EE — N BB , 
except that Nt is replaced with Mp. The quantities Nr, Afp, and Ofwhm (combined with the fraction f s k y of the 
sky surveyed) thus parameterize the performance of the experiment. Noise curves compared to the CMB for the 
experiments shown in Table [I] are shown in Fig. ^ The Z-value at which the beam transfer function drops to is 
given approximately by l c « (8095 aremm) / 9 pw h m ■ 

We will introduce a vector x containing both temperature and polarization information: x = (T, Q,U). The lensed 
and measured temperature/polarization vectors will be denoted x and x, respectively. [Note: x is a "vector" in the 
sense of linear algebra, i.e. it is an element of a vector space, in this case a Hilbert space with the usual L 2 (S 2 ) inner 
product, on which matrix operations such as C can act. It is not a vector in the sense of differential geometry.] Since 
most of the fields we deal with, including CMB temperature and polarization, are real, their Fourier modes satisfy 
e.g. T\ = T*j. Consequently, if we have N Fourier modes, the A^-dimensional vector with components {T{\ only has 
N/2 independent complex components; the remainder contain redundant information. (Of course, there are still N 
independent real components.) The covariance matrix C is defined as C xx — (xx'); note that it is Hermitian by 
construction. 
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CMB polarization and noise power spectra 
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FIG. 1: CMB polarization power spectra for 15-type and B-type polarization (upper and lower solid curves, respectively). The 
noise curves for the experiments of Table U] are shown as dashed lines; from top to bottom: WMAP, Planck, and Reference 
Experiments A, B, C, D, E, and F. 

B. Lensing 

The lensed temperature and polarization are given in terms of the unlensed temperature by means of the re-mapping 
function g: T(n) = T[g(h)], and similarly for Q and U [however, E and B do not transform this way, rather one must 
use Eq. The re-mapping function can be decomposed into a longitudinal part generated by the lensing potential 
$, and a transverse part generated by the lensing cross-potential Q: 



g(h) = n + V$(n) + *VO(n) 



(5) 



where * is the ninety-degree rotation operator: *e x = e y , *e y = —e x . Past studies of CMB lensing reconstruction 
have ignored the cross-potential since (for scalar perturbations) it vanishes at first order in perturbation theory. In 
principle it could become important given the high precision enabled by lensing of CMB polarization. However, we 
will show in HTV\ and WI Al that the cross-potential is unimportant for most near-term experiments. 
We restrict our attention to the weak lensing regime, i.e. we assume the magnification matrix: 



M 



/ 9fc 


dg* \ 




/ dx 


dy 






day 




\ dx 


dy ) 





d 2 x <S> - d x d y Q 
d 2 M 



d x d y $ 



1 



d x dy$ 



-3^ 

d x d y £l 



1 + K + 7Q 7(7 + LO 

7c; - w 1 + k - 7q 



(6) 



is everywhere invertible. This is a necessary and sufficient condition to disallow caustics and multiple images of the 
same portion of the surface of last scatter. Lensing by large-scale structure is too weak to create caustics on the 
surface of last scatter, so the weak lensing assumption is violated only in the vicinity of astrophysical objects such as 
clusters. We classify clusters and other strong lenses as foreground contaminants and do not consider them further 
here. In the regime where the lensing distortion is small - i.e. M is close to the identity - we may interpret the four 
components k,jq,ju,u> in Eq. © as follows. The convergence k magnifies a feature on the last-scattering surface of 
infinitesimal angular size d"& to size (1 + Kjd'd. The field rotation angle u) rotates the feature clockwise by lo radians. 
The Q-shear 7q produces a stretching along the x-axis while compressing it along y: the apparent angular extents of a 
feature along the two axes are (1 + r yo)d'Q x and (1 — r )Q)d r d y , respectively. The [/-shear has a similar effect, stretching 
along the y — x axis and compressing along the y = —x axis. It should be noted that the four fields k,jq,ju,u 
are not independent because they are all generated by differentiating the two fields $ and f2. In particular, if we 
_E_B-decompose the shear field into its positive-parity (e) and negative-parity (/3) components: 



ei = [iq] 1 cos 20i + [ju] 1 sin 2<fr , 
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A= - [7q]i sin 20i + [7ir]i cos 2^1, (7) 

we find that ei = k\ and (3\ = u>i. These are then related to the potentials via k\ — (Z 2 /2)$i and ui\ — (I 2 /2)fl\. This 
immediately implies the power spectrum relations C ; KK = Cf e = -j/ 4 C** and Cf u = Cf' 3 = ^ 4 Cp°. It is of interest 
to note that the convergence and field rotation can be determined from the deflection angle d(n) = ,g(n) — n by: 

K = --V • d and uj = -V • *d. (8) 

2 2 w 

If k,7q,7c/,o; are not small compared to 1, then the physical interpretation of these quantities is somewhat more 
complicated. We will continue to call k the "convergence," (jq,Ju) the "shear," and uj the "field rotation angle" even 
in this case, although this convention is not universally followed in the literature. Note, however, that the relations 
e\ = k\ and Pi = u>\ continue to hold (in fact, they remain valid even for strong lenses!), which makes our definitions 
of convergence, shear, and field rotation angle particularly convenient. 

Sometimes we will specify a lens re-mapping g by its convergence and field rotation, g = (k, uj). Most authors have 
performed the lensing analysis using $ rather than n as the field to be estimated, since the deflection angle is a local 
function of the former. In the present analysis, we take K (and u> when it is important) to be the fundamental field. 
Of course, the two fields contain exactly the same information, being related by the differential relation k = — 1 jV 2 <I> 
in real space and by a multiplicative factor of I 2 / 2 in harmonic space. 

It is convenient to introduce the lensing operator A g defined by A g X(h) = X(g(h)), where X is one of T, Q, or 
U. [In the {T, E, B} basis, the action of A ff is not so simple and the transformation of Eq. (J2J) must be applied.] 
Wc define the a K differential operator (and analogously cr") as the action of an infinitesimal lens configuration: 
erf = (<9A s /<9Ki[g])| g= ( .o)- Then, noting that A(o,o) is the identity operator 1, we find that to first order in (k,u>): 

A( K , W ) = l + ^(KiCrr+WiCr 1 J ) + 0(K 2 ,W 2 ,Kw). (9) 
1 

The a operator acts as follows on a field X in the {T, Q, U} basis: 

KX)„ = (*) ^x„_„ 



47T 

2\ 1 ■ ★(!' - 1) 



[afX) v = U ) V ^ ' Xv-x. (10) 



In the {T, E, B} basis, the u-matrices mix E and B because these are non-local quantities. Specifically, they have 
components: 



2 



/ 1 

cos 2a -sin 2a | , (11) 
y sin 2a cos 2a 



where the rows correspond to the {T, E, B}\ ± and the columns to the {T, E, B}^\ 2 , and we have defined the angle 
a = 0ij —4>\ 2 . The matrix for differs by replacing the prefactor IT2 with *1-1 2 . The a matrices satisfy erf = (cr^j)' . 

Lensing alters the CMB anisotropy covariance; the covariance matrix C xx , where X £ {T,Q,U} (or X S 
{T, E,B}) of the lensed temperatures is dependent on the lens configuration g, and thus we will denote it by 
C xx = C xx . Since the lensed CMB field is x = A g x, we have C g — A g CAt. The lensed covariance averaged over 
the ensemble of LSS configurations will be denoted by (C xx }lss- Note, however, that whereas the primary CMB 
is expected to be nearly Gaussian, the lensed CMB is non-Gaussian and so (C xx ) lss does not specify completely 
the statistics of the lensed CMB field. Indeed, it is the non-Gaussianity of the lensed CMB that enables separation 
of the lensing and gravitational wave contributions to B. It also means that the standard Gaussian formula for the 
uncertainty in the power spectrum, o~{C{)/Ci — y/2/(2l + l)/ s fe J/ Ai, does not necessarily apply to i?-mode polarization 
on small scales. 

It is readily apparent from Eq. (|11|) that lensing can produce _B-modcs even if these are not present in the primary 
CMB. We show in Appendix iBl that for "almost all" primary CMB realizations, there are only a small number of 
convergence modes that do not produce B-type polarization. 
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C. Chi-squared analysis of lensing 

We illustrate our formalism with a simple lensing reconstruction via x 2 minimization (the "least squares" method) . 
We perform a full likelihood analysis in flIIII but the x 2 analysis is sufficiently similar that it illustrates the basic 
concept. Define the functional x 2 ( K ) °f a convergence field given CMB data x: 

X \n) = (A^ 0) x)t(C + N^A^x + «tc« (12) 

Here AT. -jX is the de-lensed CMB, i.e. we have taken the measured CMB and projected it back onto the primary 
CMB assuming that the lens configuration is given by convergence k with no rotation. To a first approximation, this 
should have covariance C + N since it is the sum of primary CMB and instrument noise. (The matrix C + N is equal 
to the measured CMB covariance in the absence of lensing and hence will frequently be denoted by Cm,o)- Technically 
the noise covariance is not exactly N because the noise has been de-lensed; see WI Bl ) We have thus chosen to define 
our x 2 a s the amount of power in this de-lensed CMB, with the various modes weighted according to their variance. 
The addition of the k' i C kk ~ 1 k term serves to regularize the problem by preventing the convergence from running off 
to oo in search of smaller primary CMB power. 

If we take the first-order approximation to A -1 given by Eq. (J5J, Eq. 112(1 becomes: 



X 2 {n) = X 2 (0) + 2 »"i «i + £ «i (Ay + Cf" )«i', (13) 
l i,i' 

where: 

X 2 (0)= xttC + N)" 1 *, 
mi= x t (C + N)- 1 a'! 1 x, 
Ay/ = xV^C + Nj-Vfx, (14) 

Note that m is a real vector and A is Hermitian. This is a quadratic function of k and hence it has a minimum that 
can be determined via standard techniques. The minimum is at: 

k„ = (A + C**- 1 )- 1 !!!. (15) 

The error covariance S x 2 of k is found by the usual method of setting x 2 ( K ) = X 2 ( K *) + i K ~ k *)^S~ 2 1 (k — ^*); this 
yields: 

S X 2 = (A + C^- 1 )- 1 . (16) 

The most important feature of this analysis is the reconstruction error, S x 2 . Note that as the instrument noise goes 
to zero, the matrix C + N develops null directions corresponding to the -B-modes. Therefore, (C + N) _1 has infinite 
eigenvalues in these directions, and if the number of convergence modes being reconstructed is less than or equal to 
the number of -B-modes measured, we have A — * oo and S x 2 — > 0. This leads us to the conclusion that the accuracy 
of convergence reconstruction is limited only by the sensitivity of the instrument and the presence of foregrounds or 
other contaminants, not by statistics of the convergence or primary CMB field. One can note that for zero instrument 
noise, the x' \ Eq. i|12|) . is infinite unless the de-lensed CMB field AT^^x has vanishing .B-modes, i.e. in this case the 
X 2 analysis is solving for -B un iensod = (except possibly for a few degenerate modes; see Appendix FBI . We extend 



this methodology to a full likelihood analysis in £ 11111 where we find that the general conclusions of this section remain 
valid. 



III. LENSING RECONSTRUCTION: LIKELIHOOD ANALYSIS 



In this section we explore the accuracy of reconstruction of lensing based on CMB temperature and polarization. 
We follow the analysis performed in Ref. f° r the CMB temperature; most of the analysis extends easily to 

polarization, with one exception: the primary CMB has very little (if any) i?-mode polarization. This means that the 
lensed CMB power spectrum (Cf B ) lss cannot be expressed as a small perturbation on the unlensed power spectrum. 
We also include the effect of the field rotation in our discussion of the likelihood gradient and Fisher matrix, although 
we do not construct a "practical" estimator for it. 
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A. Likelihood function and gradient 

For a given lens configuration with re-mapping function g, the covariance matrix C of the measured CMB is 
computed from: 

C a = (xxt) = ((x + 7?)(x + t,)t) = C a + N = A 9 CAt + N, (17) 

where N — (r)rf) is the noise matrix. The measured CMB is Gaussian-distributed if we assume that the primary CMB 
x and instrument noise 77 are both Gaussian. (Note: the assumption of Gaussianity only applies before we average 
over LSS realizations.) The (negative log) likelihood function C for a lens configuration with re-mapping function g 
is then given (up to an irrelevant constant) by: 

£(. 9 ) = ilndetC 9 + ixtC g - 1 x, (18) 

Now we wish to determine the likelihood gradient with respect to the lens configuration g = (k, u>). We will compute 
the gradient of the likelihood function, Eq. 118() , using Eq. (|17|) : 

The maximum- likelihood estimator is given by the relation dC/dn\ = 0. (We also require d£/du>\ = if we are 
estimating u> as well as n.) However, maximum likelihood estimation of the lensing field is generally unstable because 
the lensing field has too many degrees of freedom. In order to regularize the problem, we introduce a Bayesian 
prior probability distribution oc e~ p for g — i.e. we take prior probability dP oc e^ p ^ J}j dn\duj\. It is most 

convenient to take a Gaussian prior based on the power spectra of n and (if applicable) u>: 



(k,u) = i (K+C^^K + lndetC**) + - (uAC""" + lndet C^) 



2 ^ V C; / 2 ^ V C; 

\ 1 1 



where in the second equality we have assumed that the prior on k and uj is statistically isotropic. (Note that this 
assumes the power spectra are known; we will consider the problem of estimating Cf K from CMB data in §V\ The 
methods we present in Wl allow iterative determination of both the convergence field k and the power spectrum C ; KK .) 
If we are neglecting the field rotation then the terms involving u should simply be removed. The mode of the posterior 
probability distribution is given by minimizing C + p; we thus set dp/dn\ = —dC/dK\, or: 

[C-V]; = -T, (c-^jCAt) +i »C- J^jCA-C^. (21) 

Because of the presence of the prior C KK_1 , this estimator will filter out lensing modes that cannot be accurately 
reconstructed from the CMB data. It can thus be viewed as a sort of nonlinear generalization of the Wiener filter. 



B. Practical estimator for the convergence 

The likelihood gradient, Eq. (|19fl . and hence the convergence estimator Eq. (|21() based on it, are difficult to 
evaluate. We therefore investigate several approximations to the likelihood function. First, we consider only the 
convergence, k; the rotation u> will be shown in 31 VI to be unimportant unless instrument noise is very small. We note 
that Eq. I|21|l can be re- written as: 

[C—^lf = -Tr (At-iwV^CwA^C,) + iW g - 'wA^CwA^x, (22) 
where the weight matrix w is defined by: 



w = AtC^A 9 = (C + A^NAt- 1 )" 1 . 



(23) 
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Here 1 NA^ _1 is the de-lensed noise covariance matrix, which is equal to the noise covariance N for g = (no 
de-lensing). Under most circumstances, de-lensing has much less effect on the noise than on the CMB signal, because 
instrument noise is a relatively smooth function of I and contains both E and B modes with similar power, ft is 
possible that A~ 1 NA^ _1 96 N if the noise power spectrum contains sharp features; in this case, the approximation 

w w C^j Q « used below may result in a non-optimal, or (in extreme cases) unstable estimator. 

We would like to approximate A~ 1 (dA g /dK\[g]) using the a matrices; this can be done by expanding: 



a: 



dn v [g V 



dA n 



dm[g] dm[g'] 



du} V [g l g' 



g'=g 



dK\[g'] 



(24) 



g'=g 



where the juxtaposition g~ 1 g l indicates composition of the lensing operations: (g~ 1 g')X = g^ 1 (g'(X)). If the lensing 
is very weak we may take (Ony [g~ 1 g'}/ K\[g'])\ g > =g s=s [A; 1 ]^! and (duj\> [g~ 1 g 1 ]/ Ki[g'])\ g >= g ~ 0, that is, the composition 
of lensing operations can be approximated by re-mapping the convergence field and neglecting rotation. In this regime, 
the statistical properties of the convergence field k should not differ greatly from those of the "de-lensed" convergence 
field A" 1 ^; mathematically, this means that we may take A g and C KK to commute. With these approximations, Eq. 
(I22|l becomes: 



C^ K '\A g K,)* i = (A^xjWufCwA^x - Tr 



wafCwA^CgAt- 1 



(25) 



The right-hand side of Eq. (|25|l is our approximation to the likelihood gradient, and the left-hand side is our 
(approximate) prior gradient. Note that the right-hand side evaluated at the correct lensing configuration g has 
expectation value zero, regardless of the choice of weight function; we will therefore choose the slightly sub-optimal 
weight function w = ^ in order to reduce computational difficulties. This leads us to the estimator: 

Cr- 1 (A sK )r = (C^Aj^ttrfCC^jA-^-Tl [Aj-^^of CC^A^ 



(26) 



[This choice leads to some difficulty for low-noise, wide-beam (Ofwhm > 10 arcmin) experiments; see 3VI Al for 
details.] By expanding C g using Eq. I|17f) . an d noting that in the harmonic-space basis, C and C( ,o) are diagonal 
whereas <rf has no nonzero diagonal elements, we convert this into: 



QKK — 1 ; 



(A,*)? = (e^ 0) Aj 1 x)tafCC^A- 1 i-'It 



(o.o) 1 ^ 



(27) 



C. Fisher matrix 

The Fisher matrix is defined as the expectation value of the second derivative of the likelihood function: 



F[ki, «i'] 



d 2 C 
dK,\dnv 



dC dC 
8k? 3kv 



(28) 



where the second equality follows from taking the second derivative (d 2 /dn^dn^) of the normalization condition 
J e~ c T>ic — 1 and noting that the expectation value of any statistic S is (S) = J Se~ c T>ic. (This also shows that F 
has all non- negative eigenvalues.) A similar relation holds for the field-rotation modes uj\. We may thus compute the 
lensing Fisher matrix as the covariance of the likelihood gradient; the easiest method of doing this is to apply Wick's 
theorem to compute the variance of Eq. I|19ll . This yields: 



F\k\,K\i] = 



dC dC 
8k? duv 



= Tr ( Ate; V^Atc^e 



Tr (C 9 -V« 1 CAtC s - 1 A 9 C^t) 



(29) 



and similarly for the components of the Fisher matrix elements involving the field rotation. For simplicity, we compute 
the Fisher matrix at g — (0,0), i.e. the k = lj = point, so that A( 0j o) is the identity. The C and C7 Q s matrices are 
diagonal in the {T, E, B} basis: 












(30) 
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and: 



rr 1 

(0,0) 



/ (C EE + N l EE )/D l 



-cr/Di 



V 







(31) 



where £>, = [Cj T + N? T ){C EE + N EE ) - {Cf E f. 

The overall Fisher matrix is then computed from Eq. (|29|) : 



F[ki,«i/] w ^Tr 



^(0,0) I -l^(0,0) 1 !' 



(32) 



where: 



[f, 



l Jii,-i 2 



[Call 



fri, i-i 2 

47T 



I" 



•laG 



(i T l 2 C£ B cos2a 
1 2 C, TB sin 2a 



liC^cos2a + l 2 C^ -liC£ E sin2a 
(hC EE + \ 2 C EE ) cos 2a {\ 2 C BB - hC EE ) sin 2a 

l 2 Cf B )cos2a 



.(33) 



(hC EE 



hC BB )sin2a {hC B 



BB 



(The matrix fj" is identical except for the replacement 1- — > *1-.) Note that by Hermiticity of C, we have = f*/ ; 

for the individual 3x3 blocks in the harmonic-space basis, [ffji^-b = [f"i]!i i • Also our construction guarantees 

that fj K = dC/dn\ where the derivative is evaluated at k = to = 0. 

It can be verified by explicit matrix multiplication that the computation for F[k, k] here yields the uncertainty in 
the minimum- variance quadratic estimator of Ref. |lf]j |. with one exception: we have computed the Fisher matrix at 
g = 0, hence the denominator of Eq. 1)32(1 contains the unlensed CMB power spectrum plus the instrument noise, 
whereas the equivalent calculation in Ref. |10|| contains the lensed CMB power spectrum plus the instrument noise. 
In the case of quadratic estimation, it is clear that the lensed power spectrum should be used in order to minimize 
the variance of the estimator. Conceptually, this is because the lensing B-modes can be iteratively cleaned from the 
map, thereby reducing the post-cleaning i?-mode power spectrum and reducing the noise in the lensing estimator. 
Our ability to clean the map is bounded, of course, by the sum of the unlensed CMB and noise contributions to Cf B . 



D. Uncertainty in lens reconstruction 



The usual method of estimating the uncertainty in lens reconstruction would be to invert the Fisher matrix. 
This approach is motivated by the Cramer-Rao inequality, which states that an unbiased estimator of the lensing 
configuration must have covariance at least equal to F _1 . Unfortunately, the Cramer- Rao inequality is only an 
inequality, and there is no guarantee that the bound F _1 can actually be reached; indeed this bound is only achieved 
in the case where the likelihood function is Gaussian with curvature F. The traditional justification for assuming 
Gaussianity of the likelihood function is the Central Limit Theorem. This works for studies of lensing of the CMB 
temperature field, in which the typical lensing mode being reconstructed is at I ~ 100 whereas the temperature 
fluctuations that are being lensed have wavenumber I ~ 1000; thus there are roughly (1000/100) 2 = 100 patches of 
primary CMB behind each lensing mode. (Most of the information comes from non-local correlations in the T field, 
so this argument technically requires more justification; nevertheless the calculations in Ref. 01 seem to indicate 
that it gives the correct answer.) This argument does not apply to lensing of the CMB polarization because the 
wavenumbers of the primary E polarization modes and of the lensing field modes (rej) are both at wavenumbers of 
order I ~ 1000. We should therefore be careful of possible problems with the Fisher matrix estimate, Eq. ilML'l of the 
uncertainty in the lensing field. In this section, we outline two such problems that occur in lensing reconstruction: 
first, a complete breakdown of the Fisher matrix approach when the field rotation lu becomes important; and second, 
fluctuations in the curvature matrix resulting from the statistical nature of the primary i?-field. 

Consider first the problem of simultaneous reconstruction of both k and u. (We will see in HI VI that the noise levels 
required for this are not achievable in the near term, however, this extreme example serves to illustrate the problem.) 
One can see that if there are no primary £?-modes, then as instrument noise goes to zero, the uncertainty in k and lu 
obtained by inverting Eq. (|32|l goes to zero. But this cannot be true because the one equation i? U nicnscd = cannot 
be used to solve for the two fields n and to simultaneously. Therefore inverting the Fisher matrix yields a qualitatively 
absurd conclusion. What went wrong? The observation that one equation (the vanishing of the unlensed i?-mode 
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field) cannot be solved for two variables (k and us) yields a clue. Consider the case where instrument noise is negligible; 
then we know that the measured i?-mode is purely caused by lensing: 

B x = V ( -| J [!' ' (1 - + *1' ' ~ !>!'] ^l-i' sin 2a, (34) 

where a = <^>i — 0i-i' . To the extent that the -E-mode is dominated by the primary (not lensing) contribution, 
Eq. I|34|) is a linear system containing 2N unknown variables (the amplitudes of the n and u> modes) but only N 
equations (the knowledge of the lensed B-modes), thus there are degeneracy directions in lens configuration space 
which are unconstrained by the vanishing of the -B-modes. These directions must be constrained by a combination 
of the statistical properties of primary temperature and -E-type polarization, and prior knowledge about the lensing 
field. 

We can now understand why the lensing Fisher matrix F is inadequate for determining the uncertainty in the 
lensing fields k and uj. The curvature matrix: 

^ Ki ' K " ](ff)= 4^ (35) 

has very small eigenvalues in the directions of degeneracy of Eq. I|34|l and very large eigenvalues (approaching oo 
as N — > 0) in the orthogonal directions. But as one can see from Eq. (|34p. the direction of degeneracy depends on 
E and hence on the specific realization of the CMB. If we average over CMB realizations to obtain a Fisher matrix 
F, then we derive F = oo, which does not accurately reflect the non-zero errors in the degenerate directions of Eq. 
(|34|l . Mathematically, the Fisher matrix methodology does not work because the error bars on (k,uj) are extremely 
non-Gaussian. The lesson is that we should be careful about interpreting the inverse of the Fisher matrix as an 
uncertainty in parameters when the Central Limit Theorem does not come to our aid. 

A similar but less spectacular problem occurs in attempting reconstruction of small-scale lensing modes even 
when there is sufficient instrument noise that fl is irrelevant. This is the regime of interest to a near-future high- 
resolution polarization experiment. The statistical uncertainty in the lensing reconstruction is given by the inverse 
of the curvature matrix T . When doing a lens reconstruction, this curvature matrix is augmented by the curvature 
of the prior, C K " ; ~ 1 , so that the posterior error covariance matrix of the lensing reconstruction is approximately 
(J- + C KK_1 ) _1 . We wish to compute the mean squared error in the reconstructed convergence k, which is obtained 
by computing the ensemble average of this covariance matrix over all realizations of CMB, noise, and LSS: 

S"" = {{k - n){k - «)t) iSS » ((jr + C"- 1 )- 1 )^ (36) 

The Fisher matrix F is defined to be the expectation value of the curvature: F = {T) (with no LSS average). If the 
curvature matrix were always equal to F, then it would be permissible to approximate S KK s» (F + C KK . It 
can be shown (see Appendix that the statistical fluctuations of T always increase the uncertainty, Eq. I|36(l: this 
increase we call the "curvature correction." 

Conceptually, the naive calculation that the mean squared error is approximately So = (F + C™" 1 )" 1 suffers 
problems for the same reason that the Fisher matrix calculation for simultaneously estimating k and lu failed: the 
different realizations of the primary CMB introduce fluctuations in T , and when we average over CMB realizations 
we generate a non-Gaussian error distribution for the estimated convergence. 

The actual computation of the curvature corrections is not pursued here; some of the relevant issues are discussed in 
AppendixJU] where we show that the "first-order noise contribution" of Ref. [2lJ arises as one part of the second-order 
curvature correction. 



E. Relation to quadratic estimators 



It is of interest to compare the estimator we have derived, Eq. I|27(l . to the quadratic estimation method of Ref. [Toj . 
The performance of the estimators is compared numerically in illll Bl Here we display the quadratic estimator and 
note the major differences between the quadratic and iterative estimators. The Wiener-filtered quadratic estimator 
is: 



/IKK — 1 C 

L-z 01,1' 



^(quad) 



* (C)lss ct £C(C) L £ S x, 



(37) 



where the quadratic Fisher matrix is determined as: 



(quad) 



i. 



i.i' 



-Tr 



(C)iSS f -l( C )LSS f l^ 



(38) 
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The "unbiased" (to first order in $), non- Wiener- filtered temperature is given by Eq. i|37|) with the "prior term" 
qkk - 1 g omitted. (The quadratic Fisher matrix is not technically a Fisher matrix, but its inverse does give the 
covariance of the unbiased quadratic estimator.) We prove in Appendix that Eq. H37[) and its unbiased equivalent 
are identical to the minimum- variance quadratic estimator that arises from the optimal weighting scheme of Ref. |10| . 
The mean squared error in the reconstructed convergence according to Eq. I|37|) is: 

gKAt (quad) _j_ jp(quad)^ (39) 

Several features of Eqs. (|37[) through (|39|l are readily apparent. First, the estimator Eq. I|37(l is a quadratic function 
of the CMB temperature/polarization field x. Secondly, we note that the uncertainty in the quadratic estimator is 
determined by the quadratic Fisher matrix, which contains the inverse of (C)lss- For statistically isotropic noise, 
this inverse is given by: 



/ (Cf E + N l EE )/D l -Cnb, 



-cnb, (cj T + ND/bt o I . (40) 
\ o o i/{c bb + n bb ) 



where £>/ = Cj T Cf E — Cf E2 , and C, x is the lensed CMB power spectrum (or cross-spectrum): Cf- 



(XiX'^lss- Comparison of the quadratic Fisher matrix [Eq. I|38(l ] to the full Fisher matrix [Eq. i|32|l ] shows 
that the two are identical except for replacement of Eq. (|40|) by Eq. I|31|l . This results in a qualitative difference 
between the two estimators: as instrument noise is reduced toward zero, the full Fisher matrix improves without 
bound (F — > oo), so (aside from foregrounds, field rotation, primary B-modes, and the statistical concerns outlined 
in Hill D|) the iterative estimator should be able to reconstruct the convergence with arbitrary accuracy. This is 
not so for the quadratic estimator, whose reconstruction accuracy is limited by the nonzero value of C BB and the 
resulting upper bounds on (C)^ s and F^ quad '. At high noise levels where the JS-mode cannot be mapped, however, 
C BB + N^ B « C BB + N^ B since both sides of the equation are noise-dominated, and in this regime the performance 
of the two estimators should be nearly identical. 



IV. STATISTICS OF FIELD ROTATION 



Here we investigate the statistics of weak lensing fields with the objective of understanding the importance of 
the field rotation uj (or equivalently the cross-potential il) in CMB weak lensing. Field rotation is a cosmological 
contaminant in the sense that even with noiseless CMB data and no foregrounds, we cannot hope to recover two 
fields k and to from the single equation -B un i e nscd = 0. Therefore a non-zero power spectrum Cf^ translates into an 
uncertainty in the lens reconstruction. We compute the power spectrum Cf^ by considering deflection angles; this 
has the advantage of provi ding a unified treatment of the higher-order Born approximation and "lens-lens coupling" 
effects considered by Ref. (23. We work in the longitudinal gauge because in this gauge the perturbations to the 
metric remain small (of order 10 -5 except in the very small portion of the universe near neutron stars and black 
holes) and so perturbation theory techniques are valid. We then consider the implications for lensing reconstruction; 
for near-term experiments, the effect is seen to be negligible. 



A. Lensing power spectra 

In the flat-sky approximation, we treat the photons as propagating in roughly the — e z direction so that the CMB 
experiment looks in the e z direction; the "sky" is in the xy-plane. The spacetime metric observed by the photon is 
(so long as it does not stray far from the z-axis): 

ds 2 = a 2 (r) [-(1 + 2*)dr 2 + (1 - 24-)(d X 2 + sin| X (dn 2 x + dn 2 y ))] , (41) 

where the Newtonian potential ^ is generated by the non-relativistic matter inhomogeneities, and sin^ x — 
K~ x l 2 svaiK x l 2 x) where K = —£IkHq is the curvature of the universe. The null geodesic equation in this met- 
ric is: 



d fdn . \ m . 

T ~T~ smK * = sm ^ X- 

dr \dr J on 



(42) 
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The initial conditions are n(x = 0) = no and 9 x n(x — 0) = 0. 

The usual method here is to apply the first-order Born approximation to Eqs. 142f) . i.e. we perform the integration 
over the unperturbed photon trajectory. If we integrate forward, we find that: 

h( X ) = n - 2 f W( X ', x) ^(x>(x')) dx , ; (43) 
Jo on 

where W(x' , x) = cot if x' — cot^ X . We may now apply the second-order Born approximation, in which we integrate 
not over the unperturbed photon trajectory but rather over the photon trajectory given by the first-order Born 
approximation, Eq. Ij42(l . Taylor-expanding the result to second order in ^ yields: 

r^c r"x. rye 

n(x)=n -2/ W( X ', X )d^( X ', n Q )d X ' + 4 / / W( X ", X ')W(x', x)^(x', no) • d^{ X ", n Q )d X "d X '. (44) 
Jo Jo Jo 

The convergence and field rotation at radial coordinate x are most easily derived by taking the angular Fourier 
transform of this result. If we compute the deflection angle and perform the {k,uj) decomposition of Eq. ©, we 
derive: 



; [ X W( X ',x)Mx')d X '-2j2V-W-(l-n l X l X W{ X "iX')W{x'iX)*v{x')^-Ax")d X "d X > (45) 
Jo „ Jo Jo 



and: 

a;i = -2^(l'-*l)[l'-(l-l')] T f X W( X ", X ')W( X ', X )^( X ')^-v(x")d x "d x '. (46) 

Jo Jo 

We now turn our attention to the statistics of Eqs. I|45|) and 14611 . We assume that can be described as 
a Gaussian random field because even in the non-linear regime, our line of sight passes through many regions of 
independent density fluctuation and hence non-Gaussianity is suppressed by the Central Limit Theorem. The power 
spectrum is: 

(*h(xx)*h(X2)) = K+hfiCl* (Xi)8{xi ~ X2); (47) 
here the projected potential power spectrum is determined using the Limber equation: 

Cr (X) = (k = — — , x) = 9 -^tf mQ Ht(l + zfP S (k = -J-, 

sm^x V sm KX ) 4/ 4 V sm K x 



Here we have used the 3D power spectra of the Newtonian potential ^ and fractional density perturbation S = pi p — 1; 
these are normalized in accordance with (here q S {'J, 5}): 



P q {k) = J (g(0)g(x))e Jkx ^x, (49) 

so that the logarithmic band power is given by A^(fc) = (k 3 /2w 2 )P(k). The lowest-order contribution to the conver- 
gence power spectrum is given by the familiar result (here xo is the comoving radial distance to the surface of last 
scatter): 

fXo 

Ct K = l 4 W(x,Xo) 2 Cr(x)d X . (50) 



(There are higher-order corrections to C; KK , but we do not consider them here since the purpose of this paper is to 
investigate lensing reconstruction, not to provide a precision theoretical computation of the lensing power spectra. 
Clearly if a sufficiently high-precision measurement of C ; KK is made, higher-order Born corrections should be considered 
in the theoretical interpretation of the power spectrum.) The field rotation power spectrum is given to lowest order 
by: 



rXo p 

cr = 4£(i'-*i) 2 [i'-(i-i')] 2 / d X / 

Jo Jo 



X 



d X ' W(x,XoyW(x',xfcr(x)CrAx')- (51) 
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100 1000 100 1000 

Multipole, L Multipole, L 

FIG. 2: (a) The convergence (upper curves) and field rotation (lower curves) power spectra in the fiducial cosmology. These 
are normalized to Og ear = 1.0 (solid curves) and <jg mcar = 0.7 (dashed curves), (b) The power spectra of AE (solid curves) 
and AB (long dashed) in "noise units" (/iK arcmin). The short-dashed curves are the total B-mode power introduced by the 
convergence component. The upper curves are calculated for erg ear = 1.0, the lower curves for 0.7. 



Note that the lowest-order (in the Born expansion) contribution to Cf u comes from the trispectrum of the density 
field. If the density field is non-Gaussian and this non-Gaussianity is insufficiently suppressed by the Central Limit 
Theorem, then Eq. (|51|1 will also contain a term from the connected trispectrum (\I f \l/\I'\I r ) connected- However, because 
the factor W(x" , x') m Eq. (|46|l vanishes as \" — > \\ it follows that the trispectrum components contributing to Cf^ 
involve correlations between points at widely spaced radial coordinates, which are suppressed. (Conceptually, this is 
because a single-screen lens only produces convergence and not field rotation, regardless of its Gaussianity or lack 
thereof. Thus if structures at different radial distances are independent, as assumed in the Limber approximation, 
then there is no connected contribution to Cf u .) 



B. Effect on lensing estimation 

We have computed the field rotation power spectrum, Eq. (|51|l for our fiducial cosmology using an analytic 
approximation to the growth factor [2j| and a nonlinear mapping of the power spectrum [24[ . The results are plotted 
in Fig. I^a). 

The effect of the field rotation on the lens reconstruction is to add an additional term to the CMB given by 
Ax = Vx • *VJ7. The power spectrum of Ax is given by: 

i^-L[^. ( i-iO] 2 crcn, 

i^i[*l'.(l-l')] 2 CrQ T -f'Cos2a, 
I £ • (1 - \')fCr (Cfl cos 2 2a + Cff, sin 2 2a) , 

\ E jr^ v ■ (1 " V)]2(Jr ^ sin2 2a + Cl - f ' cos2 2a) ' (52) 

where a — <j)\i — <p\. The field rotation is forbidden to have first-order correlations with the primary CMB and the 
convergence (Cf^ — Cf u = Cf w = 0) by parity; higher-order correlations with the primary CMB will be highly 
suppressed because to is determined by small-scale fluctuations in density along the line of sight with window function 
that vanishes at the last-scattering surface. There are non-vanishing higher-order correlations between k and ui, but 



qATAT 



(jATAE _ 



(jABAB 
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we do not investigate these here. [But note that by reducing the conditional covariance (w 2 )|$ — ((w)|$) 2 , these 
correlations may enable us to reduce the "noise" due to field rotation further.] 

The Ax power spectrum [Fig. EJb)] shows that the w-induced modifications to the CMB _B-modes are of the same 
order as instrument noise when the latter is reduced to (Nj 33 ) 1 / 2 i=s Afpe 1 ~ 0.2 fiK arcmin. [Since we are trying 
to set -Buniensed = 0, contamination in the i?-modes is more serious for lensing than contamination in the -E-modes; 
this is made mathematically explicit by multiplication by C^, 1 ^ in Eq. (51) .] Since Ax has vanishing first-order 
correlation with x, one might conjecture that the field rotation begins to interfere with lensing when the noise Afp is 
reduced to ~ 0.2 fiK arcmin; however, Ax is highly non-Gaussian and exhibits many higher-order correlations with x, 
so we should be cautious of trusting this conjecture. In the simulations f flVI Ajl . we find that even for our Reference 
Experiment F with 0.25 /zK arcmin the field rotation does not significantly contaminate the reconstruction of the 
convergence field - it increases the mean squared error of the reconstruction by only ~ 15%. We conclude that (at 
least at the level of the experiments considered here) the field rotation is not a problem for lens reconstruction. 



V. ESTIMATING THE CONVERGENCE POWER SPECTRUM 



Having investigated the reconstruction of the lensing field, we turn our attention to the convergence power spectrum, 
or equivalently the potential power spectrum, since the two are related by C ; KK = j£ 4 C* *. In this section, we will 
ignore any complications associated with the field rotation as these are likely to be small for near-term experiments. 
In W Al we integrate the likelihood function for the convergence to yield the "grand likelihood function" for the 
lensing power spectrum; since this results in a functional integral over lens realizations, we simplify the problem by 
introducing a Gaussian approximation. We make further approximations in W Bl to yield an estimator that is suitable 
for actual computation. 



A. Likelihood function and Gaussian approximation 



Our basic approach, modeled after Ref. [l7j . is to compute the grand likelihood function C, which is a function of 
the lensing power spectrum: 



£[Cn = -\njvn e -e(K)-C( K ) = 1 lndet QKK _\ n J Vk exp 



— /c t C' t ' t - 1 K- C(k) 
2 v ; 



(53) 



The objective of this section is to develop formalism to compute the minimum of C. A "practical" version suitable 
for numerical computation will be given in W Bl 

The integral in Eq. (|53() has one dimension for each lensing mode and hence cannot be performed by any brute-force 
technique. In this situation the preferred solution is usually to use a Markov chain; unfortunately, the integral has 
of order 10 6 dimensions, and the integrand is expensive to compute, hence Eq. 1)53(1 does not appear to be solvable 
by Markov chains either. We therefore choose to approximate Eq. I|53|) as a Gaussian, in which case the functional 
integral can be computed exactly: 



c[cn 



lndet C h 



In det [JF(k;*) + C KK ~ 1 ] 



K — 1 , 



£(«*) 5 



(54) 



where is the point where C + ^k'C kk ~^k is minimized, and is the curvature matrix, Eq. I|35[) . evaluated at 

the lens configuration g — («;*, 0). 

A grand likelihood gradient Ti can then be defined: 



dC 

dCf K 



-Tr 



_ x dC KK 



-Tr 



i 

—i 

2 



Tr 



p/tK — 1 £1 K K. — 1 

QfJKK 



dh 



d_ 

dn 



1 



- 1 ac + £(k)^| 



(55) 



Note that and hence J-"(/t*), do not depend on C ; KK except implicitly through «*. Also, if we use that k* is the 

maximum of ^K,' ! G Kli ~ 1 K + C with respect to k, we find that the final derivative with respect to k in this equation 
vanishes. We further note that dC KK /dC^ K is simply the projection operator onto the I representation of SO(3), i.e. 
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in harmonic space it has l's as diagonal elements with multipole I and O's everywhere else. Defining di to be the 
number of modes of multipole / (note that on the sphere, di = 21 + 1), this allows us to simplify Eq. I|55|) to: 



di 



E 

1: |l|=i 



l )- r 



1.1 



2Cf 



E 

1: \l\=l 



2C*, KK 



9k* 



-Tr 



<9k 



(56) 



Here the sums are over all modes 1 of multipole I. 

It sometimes occurs that we wish to estimate the lensing power spectrum not by estimating the individual Cf K , 
but rather by "binning" the power spectrum. This is useful if, e.g. the (S/N) 2 per multipole is low or if the partial- 
sky nature of a survey causes confusion between power in neighboring multipoles. In this case, we introduce "basis 
functions" {Ai 11 } for the lensing power spectrum: 



cr = E c ^f; 



(57) 



the coefficients are now to be estimated. The maximum-likelihood estimator is then the choice of that satisfies: 



(58) 



B. Practical estimator and uncertainty 

Ideally, we would like to implement the maximum-likelihood estimator for the coefficients c^, i.e. Eq. I|58|l . 
Unfortunately, this involves setting to zero some linear combination of the IYs given by Eq. I|56[) . which is a highly 
non-trivial task. We therefore take the approximation that the curvature matrix T does not depend on k, then T; is 
seen to depend only on the quantities C ; KK and: 

*' = 2C^ E l K *i| 2 ' ( 59 ) 

* 1:|1|=J 

which explicitly depends on Cf K but is also implicitly a function of Cf K through its dependence on k* . Note that the 
functional form of T t is T^Cf", in) = r ; (0) (C ; KK ) - v t . Eq. © then reads: 

= rW(c v )-v tl (c v ), (60) 
l 

where we have defined v^(c u ) = J2i-M-fvi(c u ). We are thus attempting to solve v^(c u ) = but is some 

complicated function of the convergence power spectrum coefficients {c u }. We solve this problem by approximating 

Ijt (c„) ~ (Vfj,(cu)) lss[c v ], i-e. the expected value of v p (c„) where the LSS realizations are drawn from a lensing 
convergence power spectrum Cf K — ^2 c v M.\. We therefore use the estimator: 

V,,,{Cu) = (v,i(c v )) L sS[c„] VjLt. (61) 

Eq. I|61|) is somewhat abstract, so we clarify its meaning here. The statistic f M (c„) is proportional to the power 
spectrum of the iterative convergence estimator obtained by solving Eq. 1)27(1: this depends on the prior power 
spectrum Cf K = ^„ c v J^A\ as well as on the data. The solution {c u } to Eq. I|61[) is the set of power spectrum 
coefficients for which equals its expected value (which is most easily determined via Monte Carlo simulation) . This 
approach has the advantage of "calibrating out" the noise biases discussed by Ref. [2lJ ■ (Note that some convergence 
modes - those corresponding to large eigenvalues of the curvature T - are reconstructed better than others. What is 
especially useful about w M , or equivalently the power spectrum of the iterative estimator, is that the iterative estimator 
filters out the poorly reconstructed modes. Thus the convergence modes that are reconstructed more accurately are 
weighted more heavily in determining v M and hence in determining the convergence power spectrum.) 

Finally, we wish to determine the uncertainty on the solution {c v } to Eq. H61|) . If we average over many convergence 
modes, then this uncertainty should be given by the inverse of the grand Fisher matrix ( G )F for power spectrum 
determination: 



o = J2 M i rf^cn-viM 



(62) 
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i.e. (G) F is the covariance matrix of t> M . If the reconstructed convergence can be approximated as a Gaussian 
random field (which is true in the case where the reconstruction has high signal-to-noise ratio since in this case the 
k* « k, which is Gaussian because k is produced by many LSS fluctuations along the line of sight), then we can take 
the Gaussian approximation to Eq. (|62|) . This is obtained by considering the covariance of i>i according to Eq. H59|) 
and using Wick's theorem; this yields: 

< 0)F ,.,, = ? *«(P) 2 . m 

We remind the reader once again that the approximation Eq. I|63[) to the power spectrum estimation uncertainty is 
only valid if the reconstructed convergence field is approximately Gaussian. If k* has a significant trispectrum 
when averaged over LSS+CMB+noise realizations, then Eq. I|62l) must be used instead. This is only a problem in 
the low signal-to-noise (high I) regime in which lensing modes cannot be reconstructed individually and their power 
is only statistically detected. 



VI. NUMERICAL SIMULATIONS 



Throughout our derivation of lensing and tensor power estimators, we have made various approximations that 
should be tested. The most robust way to do this is to conduct a numerical simulation of the CMB and lensing field, 
and then construct lensing estimators, comparing the error to the theoretical estimates of Eqs. <|32ll and (|38[l . In all 
cases, we have used a flat sky with toroidal boundary conditions. We will only simulate the CMB polarization here; 
formally, the polarization-only estimators are obtained by setting Afr — oo in the relevant equations. 



A. Reconstructing the convergence 



The simplest simulations involve reconstruction of the convergence k. We generate simulated CMB T, Q, U, and 
K fields on a 34°08' square patch of sky with resolution 1 arcmin per pixel (2048 x 2048 pixels); lens the simulated 
CMB; and add appropriate noise. 

We wish to compare the quadratic estimator, Eq. I|37l) with our new estimator, Eq. 127fl . The former is relatively 
straightforward to compute; the latter requires that we apply the methods of WI Bl We simulate Gaussian random 
realizations of the Q, U, and k fields, perform the lensing re-mapping, and add appropriate noise. We then compute 
the estimators of Eqs. i|37|) an( l H27fl . There are two tricks that are very useful in numerical computation of these 
estimators: first, simultaneous computation of inner products V a^u for all 1; and second, stochastic trace computation. 
We discuss each of these here. 

The simultaneous computation of inner products was introduced by Refs. 0, ITol ] in order to compute quadratic 
estimators. A general version of this is (on a flat sky; see Ref. [25| for an all-sky version): 

J2 (tVVK 1 '" = Yl V ft -[^(n)V fl nx(n)]. (64) 

l ^ ' Xe{T,Q,U} 

(Note that this equation requires that t and u be written in the {T, Q, U} basis since E and B have different 
transformation properties under lensing. Also the asterisk on tx is of course unnecessary if t is a real field.) If t and 
u are expressed in real-space, then the right-hand side is easily evaluated. The quantities ^Pt^a^u are then obtained 
via a fast Fourier transform; division then trivially removes the I 2 /2. There is a zero-wavenumber mode corresponding 
to Z = which presents a problem for division. Here we simply set this convergence mode to zero; in the complete 
all-sky treatment this would be justified by noting that the convergence is — | times the divergence of the deflection 
angle vector, hence J KcPh — and so k; = o = 0. [There is a corresponding trick for the field rotation: Eq. I|64|) 
remains valid if we make the replacements cr 1 K — > and V„«x(fi) — > *Vn«x(n).] 

The trace in Eq. Q27[) is most easily evaluated stochastically: if we generate a random noise vector r\ with covariance 
N, then the trace is equal to the expectation value: 



Tr 



A r ic roVr c cro!o)V N 



= ((C^A^VfCC^A^). (65) 



If this Monte Carlo method is used to compute the trace, then the Monte Carlo error in its computation for one 
realization of r\ is less than or equal to the instrument noise contribution to the uncertainty on the right-hand side of 
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FIG. 3: The power spectrum of the error in the convergence reconstruction for Reference Expts. A-F. The top curve in each 
panel shows the overall convergence power spectrum Cf . The middle curve shows the theoretical, i.e. from Eq. 1381 power 
spectrum of the convergence error k — k in the Wiener-filtered quadratic estimator Eq. I(37( : the "+" data points indicate 
the power spectrum of this error as recovered from simulations. The error power spectrum for the iterative estimator Eq. 
166(1 . again as recovered from simulations, is shown with the "x" data points. The bottom curve shows the theoretical best 
performance if the Fisher matrix limit Eq. 13211 can be achieved, i.e. if we had a truly optimal estimator and no curvature 
corrections. Note the more dramatic improvement provided by the iterative estimator when the noise is small. Field rotation 
was neglecting in the calculations for this figure. 



Eq. ((27(1 . [This is because the right-hand side of Eq. (|27|) is a quadratic function of x, with covariance C, which is 
greater than the noise covariance N along all directions.] Since the Monte Carlo error variance scales as the reciprocal 
of the number of realizations of rj used, it follows that of order a few realizations of rj are sufficient in evaluating Eq. 
((65(1 . In fact for the Reference Experiments described here, we find that there is little gain in taking more than one 
realization of r/. 

We solve Eq. ((27(1 using the iterative procedure: 



CiiK 1 } 



fiKK J ( C~* — 1 



,A- 1 x) t cr 1 ^CC7 n 1 n ,A- 1 x - Tr 



} + (l-0K,i. (66) 



Here g is the lens configuration with convergence k and no rotation: g = (k, 0), and the Q are convergence parameters; 
we choose them to be: 







C(c 



(67) 



where £( c ) is a constant satisfying < C( c ) < 2- It is found that the Wiener-filtered quadratic estimator, Eq. 137(1 . is 
a good choice for initializing this iteration. The choice of C( c ) is an intricate issue: if it is set too small, the rate of 
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FIG. 4: A simulated reconstruction of the lensing convergence using polarization and Reference Expt. C parameters. In the left 
panel, we display the realization of the convergence field n used to produce the simulated CMB. The reconstructions using the 
Wiener-filtered quadratic estimator and the iterative estimator are shown in the center and right panels, respectively. These 
frames are each 8° 32' in angular width, corresponding to 1/16 of the simulated area; the scale ranges from black (diverging, 
K = —0.12) through white (converging, k = +0.12). Although all lensing multipoles up to / = 3600 are simulated, we have only 
displayed the I < 1600 modes in these figures for clarity. Field rotation was neglected in the calculations for this figure. 



(a) Dependence on noise (0 FW hm = 4 arcmin) (b) Dependence on beam (N P = 1 .41 uK arcmin) 
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FIG. 5: The dependence of the mean squared error in lensing reconstruction, — ^i| 2 ), on the instrument parameters. The 
baseline is Ref. Expt. C, Mp = 1.41 fiK arcmin, 9fwhm = 4 arcmin. The thick solid line is the raw power spectrum (7f R ; the 
thin solid lines indicate the mean squared error for the lensing reconstruction using the iterative estimator. As described in the 
text, the iterative estimator is unusable for wide-beam experiments (>10 arcmin); we used the quadratic estimator for these 
cases (dot-dashed curves). The dashed lines indicate the ideal zero- noise reconstruction error from the quadratic estimator 
according to Eq. 1381 with polarization only (top) and temperature+polarization (bottom), (a) Changing Mp; units are fiK 
arcmin. (b) Changing Bfwhm; units are arcmin. 



convergence of the iteration becomes unacceptably slow; if it is set too high, the iteration can fail to converge entirely. 
The convergence can be understood by approximating Eq. 166(1 as linear in K n : 

<+!,! « ClCriH^n - + (1 - 0)«n,l! (68) 

here we have approximated the response matrix of the likelihood gradient using the curvature matrix. Then the 
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(a) Quadratic estimator (b) Iterative estimator 

' ' i 1 1 i ' ' i — ■ 1 




500 1000 
Multipole number, L 



1500 




500 1000 
Multipole number, L 



1500 



FIG. 6: The correlation coefficient p — (nk) / (k 2 ) 1 ^ 2 (k 2 ) 1 ^ 2 between the estimated and reconstructed lensing convergences as 
a function of multipole I, as determined in simulations. The correlation coefficients for the quadratic estimator are shown in 
panel (a); those for the iterative estimator are shown in panel (b). In both of these panels, the eight curves are for Reference 
Expts. A, B, C, D, E, and F (bottom to top) from Table[I] Field rotation was not included in the calculations for this figure. 



requirement for convergence is that all of the (possibly complex) eigenvalues of the matrix: 

Ri,v=S l ,v-C ic) c 1 i- l+Fr (69) 

lie within the unit circle. Note that, averaged over CMB+noise realizations, (T) = F, and hence (R) = (1 — C(c))l> 
hence we conclude that the iterative procedure should be convergent for < C(c) < 2 in the absence of curvature 
corrections. In reality, very small values of Q c \ may be necessary for convergence, especially in cases where curvature 
corrections are large. Since .F + C KK_1 is positive definite at the maximum posterior probability point, there is always 
a positive value of £( c ) that results in convergence. The cases in which the small values of £( c ) are required are those in 
which curvature corrections are large; we have found from our simulations that these are the low-noise experiments. 
Convergence is generally found to be faster for the high I convergence modes. 

One problem we have encountered is that for experiments with low noise and wide beam (Opwhm > 10 arcmin), 
the iterative estimator given by Eq. I|66(l is unstable. This instability arises because the noise Ni is strongly blue; 
hence the de- lensing operation A" 1 in Eq. 1|66[) mixes high- multipole noise down to lower multipoles where it disrupts 
the lensing estimation. This problem is in principle solvable by using the correct (C + A~ 1 NAt ~ 1 )~ 1 weight function 
in place of C^ 1 ^ in Eq. \23\i . However, since this occurs in a regime where the iterative approach does not improve 
upon the quadratic estimator approach anyway, we recommend simply using the quadratic estimator for wide-beam 
experiments. 

We illustrate by considering the reconstruction of lensing using Reference Experiments A-F. The residual error in 
the reconstructed convergence k, as measured by computing the power spectrum of the difference k— k between input 
and reconstructed convergence maps, is shown in Fig. [3]for both quadratic and iterative estimators. For the iterative 
estimator applied to Ref. Expt. C, we set & c \ = 0.12 in Eq. ijrjrjjl . used three realizations of the rj field in Eq. ijfjo")) . 
and performed 64 iterations. Ref. Expt. F has a lower noise level and so it is necessary to use the smaller convergence 
parameter £( c j = 0.04; the convergence is thus slower and we used 256 iterations. Ref. Expt. A has a higher noise 
level and so we can use C(e) = 0.2 and 24 iterations. Maps of the input and reconstructed convergence fields for the 
Ref. Expt. C reconstruction are shown in Fig. 01 The dependence of the iterative estimator reconstruction accuracy 
on noise Np and beam size (FWHM) Qfwhm is shown in Fig. [3J We have also displayed in Fig. |S]the (theoretical) 
reconstruction error curves for the quadratic estimator in the absence of instrument noise. These curves represent 
the fundamental limit to the reconstruction accuracy possible with quadratic estimators; it is readily seen that the 
iterative estimator can do better if noise is low (A/p < 0.5-1 fjK arcmin, depending on the range of I considered). 
[Note that we display C Z KK in these plots, whereas some authors have displayed instead 1(1 + l)C dd /2ir, where d = V$ 
is the deflection angle. The two are related by 1(1 + l)C dd /27r = (2/ir)C£ K ] 
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FIG. 7: The effect of field rotation on lensing estimation for Ref. Expts. E and F. (a) Power spectra of the convergence n (thick 
solid line), convergence error k — k with field rotation (thin solid lines), and convergence error k — k without field rotation 
(thin dashed lines), (b) The worsening of the reconstruction due to the presence of field rotation, as measured by the ratio 
of power spectra of the convergence errors: (\(k — K)i| 2 ) w ith — H)l | 2 )without u . The same CMB, LSS, and noise (scaled 

appropriately to the experiment) realizations were used for all the simulations in this figure. 



The accuracy of reconstruction can also be represented by the correlation coefficient pi = Cf K / 'yC ; KK C ; KK The 

correlation coefficient is the figure of merit if the objective is to cross-correlate the convergence from CMB lensing 
with another tracer of the density (e.g. weak lensing of galaxies), since the signal-to-noise ratio of the cross-correlation 
is determined by pi. We have plotted the correlation coefficient in Fig. Elfor the various Reference Experiments. The 
iterative estimator offers improved reconstruction, especially for the lower- noise experiments (C-F). 

Up until this point we have neglected the field rotation w; we should verify that this is justified. We do this 
by introducing field rotation with power spectrum given by Eq. (|51|1 as computed in 3IVBI with normalization 
Og ear = 0.84. We then compare the performance of the iterative estimator, Eq. Ijfifijl. with and without the field 
rotation. The comparison is shown in Fig. \7\ it is seen that the field rotation increases the mean squared error of the 
reconstruction by only ~ 5% for Ref. Expt. E (0.5 pK arcmin noise, 2 arcmin beam) and ~ 15% for Ref. Expt. F 
(0.25 pK arcmin noise, 2 arcmin beam). 

As a final note, we find that for low noise levels, a large number of iterations is required because our iterative 
process is ill-conditioned. Indeed, it is possible that there are eigenvalues of R that are so close to unity that their 
corresponding modes have not converged even after tens or hundreds of iterations; if this is the case, then it should 
be possible to improve upon our results by increasing the number of iterations, or by finding an iterative scheme that 
converges faster than Eq. I|6bf> . This is allowed by the Fisher matrix noise limits, which are significantly lower than 
the achieved noise levels (see Fig. |3J). We consider this possibility unlikely since we tried increasing the number of 
iterations in several of the simulations and found little improvement. Additionally, modes with eigenvalue Ar close 
to unity correspond to flat directions of the curvature matrix T [see Eq. I|69|) ]: such directions, however, cannot be 
reconstructed accurately regardless of how many iterations are used. 



B. Extracting the convergence power spectrum 



We compute the lensing power spectrum from simulated data by solving Eq. (|61fl . The approach, once again, is 
iterative: we adjust the power spectrum Cf K until = (v^lss- [Note that both the left and right sides of Eq. 
(IFTH depend on Cf K .] We will attempt here to compute the binned power spectrum, i.e. we choose a basis for the 
convergence power spectrum given by: 

pAl < I < (m + 1)AZ, (7Q) 
otherwise, 

where Al is the bin width and p ranges from through Nb in — 1. We are thus attempting to reconstruct the power 
spectrum in Nbi n bins, equally spaced out to maximum multipole Z max = Nu n Al. 
Our iterative algorithm for solving Eq. i|bd|) is: 
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2(w Al (c J/in )) LS s[ Cw n] V (« M (c I/)n ))i55[ Cl/in ] 



1 



(71) 



where n represents the iteration number, and is the number of modes that fall into the /zth band. It is readily 
apparent that the final values c v<00 will satisfy — (v^lss- 

In order to compute the convergence power spectrum estimator, the expected value {v^(c v . n )) lss[c v „] must be 
determined; the simplest method for doing this is via Monte Carlo simulations. Since in the end we are solving the 
equation = (v^)lss, we want to make sure that the Monte Carlo-induced error in the right-hand side of this 
equation is small compared with the statistical error in the left-hand side (which depends only on the data and on 
Cu,n)- It is trivial to see that after Nmc Monte Carlo simulations, the variance in determination of the right-hand 
side is 1/Nmc of the statistical variance in the left-hand side. Therefore, we expect that if we use Nmc Monte Carlo 
simulations to determine (v^lss, then the variance of our determination of the convergence power spectrum will 
increase by a factor of 1 + 1/Nmc- A reasonable choice, then, is to take Nmc = 3, which results in 15% increase 
in the variance of the power spectrum estimator over the case of Nmc = (exact computation of (i> m )lss)- The 
uncertainty in the power spectrum estimation can then be estimated from Eq. (16211 with the correction for Nmc'- 



(Note that this is the standard Gaussian formula for error bars, except that it is corrected for Nmc and is written in 
terms of the filtered power spectrum Cf* K * instead of the noise power.) 

In Fig. Ufa), we show a determination of the convergence power spectrum from simulated data using Ref. Expt. 
C noise parameters. The choice of bins was Nbi n — 32, Al = 50, ^ max = 1600, and the survey area was 0.355 
steradians (2048 arcmin x 2048 arcmin, with toroidal periodic boundary conditions). We initialized the power 
spectrum estimation with the white spectrum Cf K = 1 x 10~ 9 , corresponding to c M ,o = 10~ 9 . We used = 0.5 for 
the first two iterations of Eq. (|71() . which are sufficient to bring the estimated power spectrum c^ >n= 2 to the correct 
order of magnitude. Once this "ballpark" estimation has been completed, we used Q< v ) = 1 for the subsequent (n > 2) 



An examination of Fig. EJa) shows that the power spectrum estimator Eq. (|71|l has been successful in reproducing 
the qualitative features of the power spectrum; however, the power has evidently been overestimated at the high-Z 
end. We can perform a quantitative analysis of the performance of the power spectrum estimator using the \ 2 test, 
using the Gaussian error estimate of Eq. (|72|l . The \ 2 value for the I < 1000 region (where the power spectrum 
determination should be cosmic-variance limited) is % 2 = 25.46 for 20 degrees of freedom (p = 0.18), indicating that 
Eq. (|72l) appears to be giving a reasonable estimate of the uncertainty on the power spectrum. 

The same is not true of the high-i region 1000 < I < 1600, for which we compute % 2 = 53.05 for 12 degrees of 
freedom (p — 4 x 10 -7 ). It is readily apparent from Fig. [H]that the failure of the x 2 test is due to an upward bias in the 
power spectrum estimator Eq. (|71|) . This bias occurs because, regardless of c M , our power spectrum estimator assumes 
that there is no convergence power at I > 1600. However, it still detects the B-modes induced by this short-wavelength 
convergence power, and introduces excess convergence power at I < 1600 to reproduce these i?-modes - hence Cf K 
is overestimated. This bias can be removed by either estimating the modes to higher I or including the aliasing of 
power from those modes into the modes we estimate as an additional source of noise j2|| . This explanation of the 
upward bias is confirmed by Fig. IBIb), in which we have artificially "turned off" the lensing effect for convergence 
modes at I > 1600, then produced a simulated data set and applied the power spectrum estimator Eq. I|71|l . In order 
to make the comparison between the original simulation and the restricted (i.e. short-wavelength lensing turned off) 
simulation as simple as possible, we have used the same CMB, noise, and LSS realizations for both. One can see 
by comparing Figs. [SJa) and[S{b) that there is little effect at low I, where the power spectrum estimation is limited 
by cosmic variance. However, at high /, one can see that the bias present in the original simulation has disappeared 
in the restricted simulation, thereby confirming that the bias was due to high-Z convergence power. The \ 2 f° r the 
1000 < I < 1600 range has been reduced to \ 2 — 31.36 (12 degrees of freedom, p = 1.7 x 10~ 3 ), which is still 
indicative of underestimation of the uncertainty in the c^. Thus we conclude that in this regime, either the Gaussian 
error estimate Eq. l(72^l is underestimating the error by a factor of ~ y / 31/12 « 1.6, or the error bars are correlated, 
or the iteration of Eq. (fTTJl has not completely converged. 

In a real lensing experiment, the underlying primary power spectrum C, is unknown and only the lensed power 
spectrum is directly observable (and even our knowledge of this is limited by instrument noise and cosmic variance) . 
Thus a slightly more complicated version of the above analysis will be necessary to simultaneously solve for Cf E 
and Cj KK . (Although since in the regime we are examining, I < 3000, the E power spectrum is dominated by 




(72) 



iterations. 
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Convergence power spectrum estimation 



a) Full simulation 
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(b) With L<1600 convergence modes only 
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FIG. 8: (a) Simulated convergence power spectrum estimation from Ref. Expt. C with solid angle 0.355 steradians. The solid 
curve is the fiducial model Cf re ; the points are the convergence power spectrum measured from simulated data after 5 iterations, 
(b) The same, except that the I > 1600 convergence modes were ignored in producing the simulated data (however, exactly 
the same CMB+LSS+noise realization was used). The horizontal error bars indicate the widths of the bins, while the vertical 
error bars are the la measurement uncertainties according to Eq. <!72t . Note that the vertical error bars include the Monte 
Carlo error associated with using Nmc = 3 simulations to determine (v^); if we had calculated (u M ) exactly (Nmc 3> 1), the 
vertical error bars would be 13% smaller (see text for details). 



primary anisotropies rather than lensing, we do not expect a degeneracy between these two quantities.) It will also be 
necessary to estimate the convergence power spectrum well beyond the region of interest in order to avoid the upward 
bias described here. Since the signal-to-noise ratio at high I is low, it will be necessary to use wider bins (i.e. larger 
AZ) in this region. The choice of exactly which bins to use must be determined by the characteristics of the specific 
experiment. 



VII. CONCLUSIONS 



Weak gravitational lensing of the CMB allows us to reconstruct the (projected) mass distribution in the universe, 
thereby probing large-scale structure and its power spectrum. Since the window functions for lensing peak at redshift 
z of order unity, lensing offers the possibility of using the CMB to study the low- redshift universe 0, 0, 0> 0> 01 • 
"Cleaning" of lensing from CMB maps is potentially valuable for studying the primary CMB, particularly for infla- 
tionary gravitational wave searches using the low-Z £?-mode polarization |lllll2j |. Since the primary CMB polarization 
is expected to contain only E- modes on the relevant angular scales (Z of order 10 3 ), while lensing transfers some of the 
CMB polarization power into -B-modes 0, all -B-modes that we see on these scales are due to lensing (or foregrounds). 
Thus the CMB -B-mode polarization allows much better lensing reconstruction than is possible using temperature 
data alone. It is thus of interest to consider optimal methods of reconstructing the lensing field from CMB polarization 
data; in this paper, we have investigated this problem in detail and improved significantly on the previous quadratic 
estimator methods [l(J . We have shown that this improvement can be up to an order of magnitude in mean squared 
error over the zero-noise reconstruction error for the quadratic estimator. 

We make several comments concerning the present calculations. First of all, our lensing estimator, Eq. I|21|l . 
while statistically superior to the quadratic estimator, still does not achieve the Cramer-Rao bound on reconstruction 
accuracy. We have argued that this results in part from "curvature corrections," fluctuations in the curvature matrix 
that render the Cramer-Rao bound impossible to achieve (more generally, this should also serve as a warning against 
blindly assuming that the statistical errors in any measurement are given by F -1 .). We expect that our lensing 
reconstruction estimator is near-optimal since it is an approximation to the maximum-likelihood estimator and our 
iterative estimator shows no signs of incomplete convergence, however the possibility of further improvement has not 
been ruled out. 
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Secondly, we have assumed negligible primary B mode polarization here (although the formalism described herein 
is trivially modified to include significant primary B- modes, the results would be qualitatively different). In the 
absence of vector or tensor perturbations, this is correct; if vector or tensor perturbations are present, then one must 
consider their effect on lensing reconstruction. In the case of inflationary gravitational waves, primary B-modes are 
generated mostly on very large angular scales; the arcminute-scale anisotropies used for lensing reconstruction are 
uncontaminated [Icj . (Formally, if we were doing a lensing reconstruction with the objective of cleaning lensing 
contamination of the tensor- induced reionization bump at I < 20, we would set N^ B = oo for I < 20 so that the 
lensing reconstruction does not remove tensor £>-modes.) A more rigorous investigation of the effect on inflationary 
gravitational wave searches is deferred to future work. 

Thirdly, the real CMB is contaminated by foregrounds - an important issue for all CMB anisotropy experiments. 
One advantage of using CMB polarization for lensing reconstruction is that whereas the small-scale CMB temperature 
field is heavily contaminated by scattering-induced secondary anisotropies such as the thermal and kinetic Sunyaev- 
Zel'dovich effects, Ostriker-Vishniac effect, and patchy reionization, these effects are much smaller for polarization 
|27j . However, polarized point sources and galactic foregrounds are still a serious concern. These have very different 
frequency dependence than the blackbody fluctuations characteristic of the CMB, and this property has been exploited 
to remove them; unfortunately, their fluctuation spectrum, degree of polarization, non-Gaussianity, and variations in 
frequency dependence are poorly understood. Galactic foregrounds do not correlate with the cosmological signals, and 
in this sense the residuals from their subtraction act like instrument noise contaminating the -B-mode (and, to a lesser 
extent, E-mode) polarization. The foreground power spectrum is likely to be different from that of instrument noise 
and is variable across the sky; nevertheless, if the covariance matrix of the foregrounds (or residuals after foreground 
subtraction) can be determined, then we can add the foreground covariance to the instrument noise covariance matrix 
N. (If the statistical properties of the foreground residuals cannot be determined or at least constrained, then any 
cosmological analysis is pointless regardless of the methods used.) Polarized point sources produce Poisson noise; also 
since many of them are extragalactic, one could be concerned about their correlation with LSS and hence the lensing 
signal. We leave a detailed study of foregrounds and their impact on lensing reconstruction to future investigation. 
We note that the predicted levels of foreground contamination from dust and synchrotron galactic emission are at a 
level of a few /iK arcmin prior to any frequency cleaning [28j , comparable to the noise levels discussed here. Frequency 
cleaning should reduce this, at the expense of amplifying instrument noise. If foreground removal is inadequate, this 
may result in anomalies in the final results such as unphysical correlations between the convergence maps and CMB 
polarization, variation of the convergence power spectrum between "clean" and "dirty" portions of the sky, correlation 
of the convergence maps with synchrotron or dust emission, etc. 

In summary, we have shown that taking into account the full likelihood function allows improved reconstruction 
of the lensing of the CMB polarization field over that achieved by quadratic statistics. For purposes of computing 
the lensing power spectrum or cross-correlating CMB lensing with another tracer of the cosmological density field, 
the most important improvement is at high I where earlier approaches do not reconstruct the convergence at high 
signal-to-noise. (At low I, the reconstruction is already cosmic variance limited.) If one's objective is to clean out 
the lensing effect in search of primordial gravitational waves, then the relevant quantity is the residual error in the 
reconstruction, and it is important to reduce this even if the convergence has been mapped at high signal-to-noise; 
hence improvement at all multipoles is useful. We conclude that the likelihood-based estimators developed here offer 
the best prospective so far to extract the full amount of information from future high-resolution CMB polarization 
experiments. 
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APPENDIX A: QUADRATIC ESTIMATOR 

In our simulations, we have compared the error of our iterative estimator, Eq. I|27|) . with that of the quadratic 
estimator, Eq. I|37|l . Here we show that the latter estimator corresponds to the optimally weighted quadratic estimator, 
as proposed by Ref. Statistically isotropic noise is assumed throughout. 

We begin by expanding Eq. I|38|) using the formula for fj K given by Eq. (|33|) . The off-diagonal elements vanish by 
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symmetry, while the diagonal elements are: 

F ( qU ad) = I^Tr{[P 1 ]_ ll , l2 [(C) LSS ] i - 1 [fr], 2 .-, 1 [<C) iss ] i - 1 } , (Al) 
ii 

where we have defined I2 — 1 — li, and the inverses are 3x3 matrix inverses (using the {T, E, B} basis). Next we 
note that to first order in n, the correlation between two Fourier modes of temperature or polarization is: 

<x ll xi l2 ) = [fr] ll ^ l2 « 1 + 0(« 2 ). (A2) 
A general quadratic estimator for the convergence k is then constructed as: 

«i = -12 ,ii A ii - (A3) 

h 

where H_i is the weight matrix, which we assume without loss of generality to be Hermitian (since the anti-Hermitian 
part does not contribute to k\). We further require it to satisfy [S—iJ—i^ij = [Si]j 2 _ t . (This guarantees that the 
estimate k is a real field.) We can construct the optimally weighted unbiased (to first order) estimator for k by 
minimizing the variance of the estimator (neglecting the trispectrum contribution): 

Vt = (|m| 2 )l55 « E Tr {[(C>Lss] J2 [S_ 1 ]_ l2 , ll [(C) LSS ] Jl [H 1 ] ll ,_ la } , (A4) 
ii 

subject to the constraint that the estimator be unbiased to first-order (i.e. have unit response): 

l = ^2Tr{[fC] h ,- h [^-i]-h,h}- (A5) 
ii 

We may compute the minimum of Eq. (|A4I) constrained by Eq. (|A5|) using the method of Lagrange multipliers. The 
equation 5Vi + \~ 1 5l = becomes: 

]T Tr {2[<C) LS 5] ;2 pS-iU* [(C)Lss] h [Zi] h ,- h + ^-[ff k-i 2 [<S3_i]_ Wl } = 0. (A6) 
ii J 

The solution to this (allowing <5S_i to be arbitrary) is: 

[S-iU,h = ^[(C)Lss]7 2 % K }^-h[(C)Lss}r i 1 - (A7) 

The correct normalization A is obtained by substitution into Eq. (|A5|) ; it is easily seen tobeA = F/ quad) . The variance 
of this estimator in the absence of lensing, determined by substitution into Eq. I|A4(1 . is iyf ; ( quad \ xhe quadratic 
estimator we have used, Eq. (|37|l . is then seen to be a Wiener-filtered version of Eq. 1A7(1 with the optimized choice 
for 5, Eq. JXJJ, and its covariance Eq. (|39|l then follows from the theory of Wiener-filtering. 

Hu and Okamoto 01 derive a quadratic estimator using essentially the same method outlined in this appendix. 
While they have chosen to separately optimize the different components of 5 (TT, TE, EE, TB, and EB) and then 
combine these to form a "minimum variance" estimator, the end result of the optimal filtering must be the same. 
(Note that while our covariance response function f is the same as Hu and Okamoto's f aside from a factor of I 2 /2 
due to use of $ vs. k as the fundamental field, we have used 5 in place of their F/A to avoid confusion with the 
Fisher matrix.) 

APPENDIX B: LENSING B-MODES AND IDEALIZED RECONSTRUCTION 

The purpose of this Appendix is to investigate the question of whether, in the absence of noise and field rotation, the 
equation -B un ionsod = could be used to completely reconstruct the convergence field. We show that with probability 
1, it is possible to reconstruct most of the convergence modes. There may remain a small number of convergence 
modes that cannot be reconstructed by this method. If we impose periodic boundary conditions, the fraction of 
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the convergence modes that are in this category is at most of order 1/Z max ; however, there may be fewer of these 
degenerate modes, or possibly none at all. We have not investigated more realistic survey topologies but we would 
expect the general result to be similar on scales small compared to the angular width of the survey. However, this 
seems mostly an academic point since zero noise is of course unrealistic, and there can be many almost-degenerate 
modes that spoil a reconstruction based on -B U nionscd = 0. 
The B-modc induced by lensing is, to first order: 

Basing = ^= X] ^)l'-(l-l')sin2a£ 1 _ FK ,=^V«i', (Bl) 

where T is a transfer matrix that is a linear function of E (and once again a = (f>\ — (j>v). In the absence of noise, 
we may set -Bijensing equal to the observed polarization B\. We see that if N Fourier modes are considered, there are 
N linear equations for N unknowns K\. (We do not consider 1 = modes since there does not exist a kq mode, and 
lensing has no effect on zero-wavenumber CMB modes.) Thus any convergence mode that cannot be reconstructed 
must be associated with a degenerate direction of T. It is clear that for some realizations of the primary CMB, e.g. 
E = 0, T is massively degenerate. We thus wish to explore whether these singular realizations are "likely" or have 
probability zero. We will assume here that Cf E > for all of the i^-modes so that "probability zero" and "measure 
zero" can be taken to be equivalent. 

In order to do this, we consider the characteristic polynomial of T: 

N 

P(A; E) = det(T - Al) = ^ a n {E)\ n . (B2) 

71=0 

The determinant of an N x N matrix is a polynomial of degree N in the entries of the matrix, hence each a n (E) 
is a polynomial of order at most N — n in the E\. We know from linear algebra that the roots and multiplicities of 
P(A; E) (viewed as a polynomial in A) are precisely the eigenvalues and multiplicities of T; in particular, the number 
of degenerate (A = 0) modes is equal to the smallest value of n for which a n (E) ^ 0. Now suppose it were the case for 
some n that a n {E) ^ with nonzero probability (recall that the primary CMB polarization E is a random variable). 
This implies that a n (E) ^ is generic, i.e. only a small (measure zero) set of values of E give a n (E) = 0. The 
significance of this result is that if we can exhibit even one possible polarization field E for which a n (E) ^ 0, it follows 
that a n {E) ^ with unit probability for the real primary CMB polarization field. A similar statement holds for the 
number of degenerate convergence modes: if we can exhibit a possible polarization field with n degenerate modes, 
then it follows that with unit probability, the lensing field as reconstructed from the real CMB will have at most n 
degenerate modes. Conceptually, this means that the generic lensing reconstruction using B un ] e nsed = cannot be 
more degenerate than any special case we exhibit. (It may, however, be less degenerate.) 

We consider here the following very simple realization: take a sky with area Air, square (with side length v4tt), 
and with periodic boundary conditions. Suppose that only the iJ-mode i?L where L = (27r/\/47r, 0) (i.e. the longest- 
wavelength mode in the x-direction) is nonzero. Now consider a degenerate convergence mode, i.e. one that does not 
contribute to -Biensing- From Eq. I|B1|) . we see that the B = requirement forces all of the convergence modes Ky 
to be zero except those for which sin 2a = 0, i.e. those for which 1' is either parallel to or perpendicular to 1' + L. 
The latter is impossible given the boundary conditions and the former requires 1' to lie in the x-direction. Thus, out 
of O(^ax) convergence modes, only the 0(l max ) modes with wavevector in the x-direction cannot be reconstructed. 
Hence no more than a fraction 0(l// max ) of the convergence modes are degenerate (i.e. cannot be reconstructed 
from -Buniensed = 0), and by the argument of the previous paragraph this must hold with probability 1 for the actual 
realization of the primary polarization field. Note that this is only an upper limit and the actual number of degenerate 
modes may be smaller, or even zero. 

The problem of lensing reconstruction using -B U nicnscd — has been considered previously using real-space methods 
by Ref. They derive the following equation for the lensing-induced B-mode: 

^V^cnsing - 7t/V 2 Q - 7 qV 2 C/ + V 7[ / ■ VQ - V 7Q ■ VET, (B3) 

where (as above) jq and ju are second derivatives of <f>. This is therefore a third-order partial differential equation for 
3>. Ref. [5j then performs a two-dimensional Taylor-expansion of $ and finds that some of the coefficients are not fixed 
by Eq. (|B3|I . They thus determine that there exists a class of lensing potential modes that do not produce B- modes 
purely by counting the number of equations and the number of variables to be calculated. The relationship between 
our approach and that of Ref. is that we express T in the Fourier basis (Eq. IB1|I . whereas they have expressed T 
in the Taylor polynomial {x^ i y fe }^ c=0 basis. These bases are not equivalent due to the differing boundary conditions 
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assumed (the Fourier basis imposes periodic boundary conditions whereas the Taylor polynomial basis does not), and 
this leads to different conclusions regarding the number and character of degenerate modes. In the Fourier basis there 
are the same number of equations as variables. Thus one cannot conclude from the counting argument alone that there 
is a degeneracy in the case of full-sky coverage. (The analysis on the sphere would be slightly more intricate since 
the sky is curved and the boundary conditions have different topology than in the flat-sky approximation; however, 
on scales I 3> 1 small compared to the curvature scale, we expect that the results presented here will still apply.) 



APPENDIX C: CURVATURE CORRECTIONS 



Our purpose in this Appendix is to investigate in greater detail the mathematical structure of the curvature 
corrections, i.e. the increase in uncertainty in lensing reconstruction due to fluctuations of the curvature matrix. We 
show that the curvature correction has another interpretation: it represents the increased noise in the reconstruction 
of m due to the presence of other lensing modes, Ky (where 1' ^ ±1), an effect studied in detail in Kesden et al plf. 
where it was called the "first-order noise contribution" and denoted by JVjJL X „ X ,,,(L). Here we show that in fact 

the curvature correction contains the likelihood analysis manifestation of the iV' 1 ) of Ref. pH ]. 

We can compute the curvature corrections to this as follows. If we define ST = T — F, and retain our approximation 
from Eq. I|32(l that F is independent of k, then we have: 



s = ((st + F + c h 



)lss ~ S + S (ST S SJ^lssSq 



(CI) 



where So = (F + C* [To derive this equation, we have merely Taylor-expanded in ST, then taken 

the CMB+noise+LSS ensemble average, and noted that by definition ST vanishes when ensemble-averaged over 
CMB+noise realizations. Note that because we have taken the expectation value, Eq. IjClfl should be viewed as an 
asymptotic expansion rather than a Taylor expansion.] The mean squared error S picks up "curvature correction" 
terms involving ST that cause it to not equal the naive result So- Note that curvature corrections to S only increase 
S, they cannot decrease it (in the sense that S — So has all eigenvalues > 0; equivalently the diagonal elements 
Sjj > [So]ji m all orthonormal bases). This is true to all orders in ST because the inverse of the mean of a set of 
positive definite Hermitian matrices is smaller than the mean of the inverses (in this same sense). 

In order to compute the second-order curvature correction explicitly, we must understand the fluctuations in the 
curvature matrix. For simplicity, we evaluate ST at K = 0. In this case, we find: 



T 



i,i' 



T[K\, Ky 



d 2 



- lndet C„ 
2 9 




dn\dK\ 

dc . dc 



<9k_i Buy dn-idny 



d 2 C 



L x - -Tr 
2 



dC 6 _ x dC 
8k-\ dny 



d 2 c 



dK-idKy 



(C2) 



where the second line has the C's evaluated at k = 0. Subtracting out the average value F\ t y over CMB+noise 
realizations yields: 



ST,y = -ix t C- 1 J ( _ u0 C- 1 x+iTr(c 'j, ir 



(-M') I > 



(C3) 



where we have defined: 



K=0 



d 2 C 



dny dny dn-\dny 



K=0 



(C4) 



Note that J(_i ) i') = J(i'.-i) = j|j _ v y 

Using this relation, and noting that at k = we have dC/dn\ = fj K , we can use Wick's theorem to compute: 



(5T h y5T luVi ) = -Tr C- 1 J ( _ u0 C- 1 J ( _ 1 ,, i; 



(C5) 



In the case of statistically isotropic noise, Eq. (|C5|) allows us to compute the covariance of the reconstruction S using 
Eq. (|C1(I . In harmonic space, the off-diagonal elements of S vanish by symmetry whereas the diagonal elements are 
given by: 
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Si = [So]i + \[S }f ^[So^Tr (c- 1 J(-i,, C- 1 J ( _ 1M )) , (C6) 
r 

which is a summation over quadrilateral configurations of the modes 1, 1', and the mode over which we sum when 
computing the trace. 

To lowest order (SF 2 ), the curvature correction is given by Eq. (|C6|) . The correction to the mean inverse curvature 
Vi, i.e. to the covariance matrix of an unbiased estimator for k, is related to the correction to Si by noting that 
Sf 1 = V^ 1 + Cf K ~ 1 , hence: 



AVi = ^AS t « \v? ^[5 ]/'Tr (c" 1 J ( _ 1) , C- 1 J ( _ Vil) ) . (C7) 



We now pass to the "linear approximation" in which the second derivative 9 2 C/9k^9ki' is neglected. (This was found 
to be a valid approximation for temperature-based lensing estimation on scales I < 3500 |17|. although it is unclear 
whether this is also true in the present context.) Substituting Eq. IjCMJI then yields: 

AVi » -V? ^[Sbh'Tr (c- 1 f/ s C- 1 f/?C- x f!! 1 C- x C! 1 , + C~ x ff Cr^Cr^C" 1 ^,) . (C8) 
i' 

This should be compared with the first-order noise contribution of Ref. [2l|. In our notation, and written in 
terms of the convergence rather than the potential, their Eq. (25) can be re-written with the help of some algebra 
and the relation f" = f*j as: 

tf$ )Tr («0 = 1^2^ cy ^[fl£,-iJ/^ (C9) 

2 i' h [( c )iss]^[(C>LSs]r 2 [( c )iss]r 2+ i'[(C)LS5]r i _r 

This is the first term of Eq. l|C8p . except that it only includes temperature information (hence we have multiplication 
of numbers rather than 3x3 matrices), and the residual power spectrum \Sn]i> has been replaced with the raw 
convergence power spectrum Cf, K . (The latter difference arises because Ref. [2l| computed the first-order noise ATM 
for the quadratic estimator; when the Bayesian estimator is used, the contaminating modes {kv} have their power 
reduced from C'f, H to [So]/' since the estimated lensing field k is used to de-lens the CMB.) Thus we see that the 
first-order noise arises in the likelihood formalism as a curvature correction, which is not taken into account in the 
Fisher matrix for lensing reconstruction. 

The question naturally arises as to the interpretation of the second term in the curvature correction, Eq. (|C8() . We 
note that within the linear approximation, 

(crVss = c^ 0) + ^[^rC^f^c^r^c^. (cio) 

r 

The second term of Eq. i|C8(l is thus seen to be the correction to the Fisher matrix, Eq. (|32|l , due to the lensing effect 
on the CMB power spectrum. 
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